QUANTITATIVE MODELING AND ANALYSIS IN 
ENVIRONMENTAL STUDIES 



Donald P. Gaver 

Department of Operations Research 
Naval Postgraduate School 
Monterey, CA 
93943 



When you can measure what you are speaking about, and express it in 
numbers, you know something about it; but when you cannot measure it, when you 
cannot express it in numbers, your knowledge is of a meager and unsatisfactory 
kind: it may be the beginning of knowledge, but you have scarcely, in your 
thoughts, advanced to the stage of science. 



William Thomson 
Lord Kelvin 



"Art is the lie that helps us see the truth, " said Picasso, and the same 
can be said of modelling. On seeing a Picasso sculpture of a goat, we are 
amazed that his caricature seems more goatlike than the real animal, and we 
gain a much stronger feeling for "goatness." Similarly, a good mathematical 
model, though distorted and hence "wrong" like any simplified representation 
of reality, will reveal some essential components of a complex phenomenon. 

Lee A. Segel 



The mathematicians are a sort of Frenchmen: when you talk to them, they 
immediately translate it into their own language, and right away it is 
something utterly different. 



Goethe 



Key Words: systems, mathematical models, groundwater models, exposure models, 
dose-response models, pharmacokinetics, pharmacodynamics, cancer models, model 
validation, risk assessment 



1. INTRODUCTION 

Most biological systems are complex, being made up of many subtly 
purposeful, interacting parts. Whenever such complex systems interact with 
each other or the environment it is useful, or even essential, to introduce a 
simplified way of thinking about and expressing the possible outcomes. To do 
so is to employ a model. In this chapter we discuss and illustrate some of the 
types of mathematical models that have been developed and found informative by 
those who study and attempt to control biological and physical environmental 
systems. In particular, we examine mathematical models for the interaction, 
eventual disposition or fate, and effect, of biological and chemical agents 
that have been released into the physical environment by mankind, have 
dispersed and accumulated, are potentially harmful to the natural environment, 
including mankind, and hence are candidates for remediation. 

Biological scientists are more accustomed to working with, and thinking in 
terms of, physical biological models (e.g. laboratory animals) than the 
quantitative mathematical models that are our topic. However, the use of 
mathematical models in biology has a long and honorable history: early 
examples include population growth models by Volterra and Lotka, and the 
genetic models of Mendel, Fisher and others. An excellent single reference is 
the book of Murray (1980). Statistical models that represent individual 
physical variations, such as in height, weight, blood pressure and pulse rate 
and many other physiological characteristics, are in routine use, as are 
models that describe human response to doses of drugs, medicines, or exposure 
to toxic agents in various forms and concentrations. These latter will be 
reviewed in this chapter. 

Environmental problems encountered by mankind typically involve control 
of the production and emission, dispersion, interactions, location, fates, and 
effects of numerous biological, chemical, and physical (such as ionizing 
radiation) components within the environment. Understanding and control of the 
many interrelated processes involved is well beyond the scope of simple field 
experimentation alone, just as it transcends the boundaries of the single 
traditional scientific disciplines and technologies. An overall logical 
framework is needed to assemble the various system components so that options 
can be expeditiously compared on the bases of costs and effectiveness. 
Mathematical, often computer-activated, models are a useful, and increasingly 
utilized, tool for economically exploring the cost and effectiveness of 
different options for environmental usage and control. This exploration makes 
use of scientific information and available data pertinent to the particular 
situations of concern, for example at those sites that are candidates for 
remediation. In addition, modeling efforts now guide the collection of 
suitable meaningful data, and help to organize and focus their summary and 
analysis. Such focusing is accomplished by initially combining data and 
theoretical understanding from the various relevant scientific disciplines and 
technologies in an attempt to address meaningful questions that are generally, 
if simplistically, of the form "If option X is used for remediating hazardous 
waste site Y, what will be the cost, how much time will be required, and what 
will be the change in the habitability, or risk, associated with the site?" It is 
increasingly recognized that there is considerable uncertainty attached to 
quantitative responses of the types described. Thorough and careful appraisal 
and communication of that uncertainty to policy makers and the public is the 
responsibility of the growing fields of risk analysis and risk communication. 

This chapter considers several of the generic modeling areas encountered 
in environmental studies. No attempt is made to be detailed or encyclopedic. 
We attempt to present the flavor of the area and to put the reader into 
contact with relevant literature and references thereto. The aim is to provide 
an overview of models for certain of the many major components of a risk 
characterization analysis or study. 

A graphical flow chart or influence diagram appears as Figure 1. 



[Figure 1. Influence diagram for flow of contaminants (external to clients).] 

The links between the round nodes show the direction of influence of 
contaminants that enter the environmental system and are transported to such 
client populations as humans, animals, and plants. We use a dotted arrow to 
denote the influence of client populations, such as humans, upon the 
contaminant sources. Needless to say, models for the behavior of all sources 
and links will not be exhaustively discussed in this chapter. 

Section 2 outlines modeling issues connected with groundwater, a vital 
resource, and also a primary medium of transport of pollutants. The history of 
models in this area is long, and the subject is highly technical, requiring 
the intellectual tools of physics and engineering, applied mathematics, 
numerical analysis and statistics, as well as chemistry and biology. 
Groundwater is one of the many media by which human beings, plants and animals 
come into contact with chemical and biological pollutants and toxicants. 
Maintenance of its quality is a matter of primary concern to the United States 
Environmental Protection Agency. 

In Section 3 we consider compartmental and physiologically-based 
pharmacokinetic models for the description of the transport patterns of 
concentration of toxic agents, and also medicines, between and within bodily 
organs. Thus, given exposure to various chemicals via drinking, cooking and 
washing water (but also through air, food, etc.) it is natural to ask about 
their ultimate concentrations in the blood that enters such organs as the 
liver or kidney. Pharmacokinetic models describe the time-dependent 
behavior of that concentration, or dosage, in terms of the exposure routes to 
the host organism (e.g. a human being) and the scheme by which blood flows, 
and its contents are modified, in that organism on the way to target organs. 
Intake from groundwater provides one of the many exposure routes referred to 
above. 

Section 4 provides an introductory account of the active new field of 
exposure monitoring and modeling. Reflection suggests that careful effort is 
needed to relate the origin and subsequent transport of pollutants, e.g. 
through groundwater, to the exposure and dosage of humans and other endpoints, 
such as the DNA adducts discussed in another chapter. 

The topic of Section 5 is that of dose-response models. Such models 
connect dosage levels, i.e. concentrations of toxic chemicals or biological 
agents, to response at the organ, or host organism, level; response may be in 
terms such as illness or death of the organism or modification of an organ, 
e.g. the proliferation of cells or the conversion of healthy cells to 
precancerous or cancerous ones. Such events are called endpoints; current attention 
appears to be focused on increasingly more biologically basic, and subtle, 
endpoints. The mathematical models that have been proposed reflect biological 
phenomena to as great a degree as possible, but that degree is limited in 
practice by the complexity and imperfect knowledge of the variety of 
biological processes by which alien substances actually alter organs. Figure 2 
describes the interactions envisioned, and modeled, in the categories alluded 
to above. 

[Figure 2. Influence diagram for flow of contaminants (internal to human and animal clients).] 

Section 6 discusses some of the issues associated with model-based risk 
assessment. The challenge is to represent multi-stage organ alteration 
processes credibly by models that are simple enough to be estimated from 
available and relevant data. Major interest focuses on the effects of chemical 
or biological agents on human beings. Often dosages of such agents are low, 
but prolonged, although response to a sudden impulse dosage, e.g. one caused 
by an accidental spill, can also be of concern. Direct experimental human 
response information is seldom available, so indication of effect is often 
sought in animal experimentation; the animals used include rodents, but also 
fish, frog embryos, and many other biological models. For reasons of economy 
and time such experiments are often done at relatively high doses. It is 
current practice to extrapolate such animal data to indicate the animal 
responses at low doses, and across species to a corresponding response in man. 
The extrapolations are typically made in terms of mathematical models. Such 
assessments are uncertain and vulnerable for various reasons, and the results 
have been questioned. The techniques that have been proposed to circumvent the 
criticisms seem currently to be twofold: to increase the size of the animal 
experiments, thus allowing more confident determination of low-dose effects, 
and to improve upon the biological credibility of the basic models. However, 
very large animal studies are costly, and hence rare. References are given 
later to the current discussion on extrapolation problems. 

Section 7 outlines, in two epidemiologic case studies, issues that were 
confronted when quantitative approaches, including the use of mathematical and 
statistical models, were employed to analyze risk to humans exposed to 
chemically polluted drinking water. Techniques are discussed, as are problems 
associated with interpretation of results. 

The various sections described above ultimately pertain to risk 
assessment, characterization, and analysis. The task of risk assessment, 
characterization, analysis, and management, is to estimate the effect on 
environmental clients, e.g. humankind but not exclusively so, of admission of 
certain chemicals and biological agents into the environmental system. Such 
admission may be appealing in many ways, for instance economically, but the 
appeal must be balanced against possible adverse effects. Risk analysis 
attempts to quantify such effects in an atmosphere of considerable 
uncertainty. In each of the sections above we point to efforts made to 
quantify the uncertainties inherent in the modeling efforts and their 
applications. Numerous references are presented to supplement the discussion. 
There remains much to be done to reduce and characterize such uncertainty. 

2. MODELING GROUNDWATER: A SPECIFIC SYSTEM RELEVANT TO REMEDIATION ISSUES 

Water is essential for sustaining life on Earth. It is also the medium 
that transports many of the pollutants that are introduced into the natural 
environment. Groundwater, which is the source of most of the water that humans 
come into contact with at the earth's surface, is an element of the 
hydrosphere, which itself acts as a system. The latter encompasses all waters 
above, on, and below the surface of the earth. Water moves cyclically through 
this closed system, being exchanged from state to state by evaporation, 
precipitation, plant transpiration and other processes including human usage 
and consumption; in particular, it enters the groundwater subsystem in 
recharge zones and leaves in discharge areas. While doing so it is consumed by 
humans, plants, and animals, 
contributes to disposing of their waste products, and is faced with 
increasingly many demands as a medium for disposing of agricultural and 
technological byproducts. 

Groundwater modeling refers to the description, in quantitative numerical 
terms, of the flow, and residence or storage, of water into, and out of, 
regions near the surface of the earth. Although such flow is of natural origin 
it is affected by man's activities such as pumping out of and into those 
regions. See van der Heijde et al. (1988), Schwartz et al. (1990), Anderson 
and Woessner (1992), and numerous articles in the journal Water Resources 
Research and elsewhere. The objectives of modeling are both to build a 
scientific basis for understanding the many interacting processes involved and 
to provide specific information for managing an essential, scarce, resource. 
Public concern is with the adequate supply of water for human consumption and 
usage, and increasingly with the quality of that supply. Models help to inform 
policy makers of the effects of regulations placed on water usage for direct 
human consumption and waste disposal, and also on the impacts of usage for 
waste disposal by the agricultural, manufacturing, electric power, 
transportation and defense subsystems of the economy. They provide information 
concerning the status and cost-effectiveness of remediation efforts. 

Models for such purposes use geological, chemical and biological science 
and principles of fluid mechanics to describe the time and spatially varying 
and interacting flows of relevant fluids: fresh water in aquifer containment, 
but also, where relevant, the extent of intruding salt water, petroleum, and 
solutions of chemicals such as fertilizers and pesticides, radionuclides, and 
other items. Specialized multicomponent models are used to assess the 
chemical-biological content of groundwater: important classes are solute 
transport models and pollutant transformation and degradation models. In 
these, the processes of change of transported or resident pollutants are 
described and predicted, using appropriate chemical and biological science. An 
important objective is to track the availability for consumption by mankind of 
various substances that enter the water cycle at various remote points and 
reach aquifers that yield drinking water. 

Since groundwater carries contaminants, its motion is of primary 
importance. That motion is best understood and mathematically modeled in fully 
saturated regions occupied by porous materials such as rock or soil. Aquifers 
are such regions: they store water accessed by wells that furnish water for 
consumption. Widely accepted mathematical models for aquifer flow are based on 
the physical conservation laws that govern fluid flow, as expressed by the 
Navier-Stokes equations. For use in the porous media groundwater environment 
these in turn may be simplified by use of the semi-empirical Darcy law; see 
Schwartz et al. (1990), Chapter 3, for a discussion of the extent of this 
law's applicability. The result is a linear second-order partial differential 
equation for pressure head that involves a hydraulic conductivity "parameter" 
(actually a function of space, as it reflects regional variation), a storage 
term, and a source-sink term. The latter represents natural flow into and out 
of the aquifer via, say, rainfall, plus the influence of pumping for direct 
usage plus recycling remediation, i.e. the recharge and discharge areas 
previously mentioned. This setup is called the porous media model. Given these 
data requirements, plus boundary and initial conditions, partial differential 
equations can be solved numerically to describe pressure head as a function of 
space and time throughout the aquifer. From head information, flow velocities 
can be derived, and these allow prediction of the concentration of a dissolved 
chemical solute at a specified place and time in an aquifer, since the 
chemical solution is regarded as being largely transported by the process of 
advection, i.e. flow along with the general groundwater. In addition, the 
solute disperses; the dispersion flux is sometimes viewed as depending on the 
current concentration gradient (Fick's law). Further models are needed, and to 
some extent available, for describing chemical reactions, abiotic 
transformation, and biological processes between many possible inorganic and 
organic solute types; see Schwartz et al. (1990), Chapter 4. Basic spatial and 
temporal change in solute concentration is also modeled by second-order 
partial differential equations, coefficients of which depend on the flow 
rates, obtained as sketched earlier, and on local dispersivities. 
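
As a rough illustration of the porous media model just described, the sketch 
below integrates a one-dimensional version of the head equation, 
S dh/dt = d/dx(K dh/dx) + W, by an explicit finite-difference scheme and then 
recovers Darcy fluxes from the computed heads. The function names, the grid, 
the harmonic averaging of K between nodes, and all numerical values are 
assumptions made for illustration; they are not taken from the references 
cited, and practical groundwater codes are far more elaborate. 

```python
def face_conductivity(K):
    """Harmonic mean of K between neighbouring nodes (an assumed, common choice)."""
    return [2.0 * K[i] * K[i + 1] / (K[i] + K[i + 1]) for i in range(len(K) - 1)]


def simulate_head_1d(K, h_left, h_right, S=1.0, dx=1.0, dt=0.2, steps=10000, W=None):
    """Explicit time stepping of S dh/dt = d/dx(K dh/dx) + W on a 1-D grid.

    K        : hydraulic conductivity at each node (may vary in space)
    h_left,
    h_right  : fixed heads at the two ends (boundary conditions)
    S        : storage term, W : source-sink term at each node
    """
    n = len(K)
    W = [0.0] * n if W is None else W
    K_face = face_conductivity(K)
    # The explicit scheme is only stable for sufficiently small time steps.
    if dt * max(K_face) / (S * dx * dx) > 0.5:
        raise ValueError("time step too large for the explicit scheme")

    # Initial condition: head equal to h_right everywhere except the left boundary.
    h = [h_right] * n
    h[0] = h_left
    for _ in range(steps):
        new_h = h[:]
        for i in range(1, n - 1):
            flux_in = K_face[i - 1] * (h[i - 1] - h[i]) / dx   # Darcy flux from the left
            flux_out = K_face[i] * (h[i] - h[i + 1]) / dx      # Darcy flux to the right
            new_h[i] = h[i] + dt * ((flux_in - flux_out) / dx + W[i]) / S
        h = new_h
    return h


def darcy_flux(K, h, dx=1.0):
    """Darcy's law between nodes: q = -K dh/dx."""
    K_face = face_conductivity(K)
    return [-K_face[i] * (h[i + 1] - h[i]) / dx for i in range(len(K) - 1)]


# Uniform aquifer, and one with a low-conductivity zone on the right.
uniform_head = simulate_head_1d([1.0] * 11, h_left=10.0, h_right=0.0)
layered_K = [1.0] * 5 + [0.25] * 6
layered_head = simulate_head_1d(layered_K, h_left=10.0, h_right=0.0)
layered_flux = darcy_flux(layered_K, layered_head)
```

With uniform K and fixed heads at both ends the computed head approaches a 
straight line; with a low-conductivity zone the head gradient steepens in that 
zone while the flux stays uniform along the column. 
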

Despite the availability of the fundamental fluid-mechanical science on 
which modeling of flows in saturated media can be based, various 
approximations are required in practice. Hydraulic conductivity "parameters" 
are seldom nearly constant, much less well known, over the entire 
region of interest, so that in practice regions are divided into 
hydrostratigraphic units (regions of similar hydrogeological properties) 
within each of which nearly the same parameter values are assumed to prevail; 
see Anderson and Woessner (1992), Chapter 3, for practical details. It has 
been noted that good prediction of dispersive effects depends on accurate 
calculation of the spatially varying velocity field to a high degree of 
resolution, which is difficult because of lack of information concerning 
influential local variations in hydraulic conductivity. It is thus difficult 
to make highly accurate predictions of contaminant concentration at various 
locations in an aquifer, e.g. near a well location. 

Contaminants typically enter the saturated regions addressed earlier 
through unsaturated zones, e.g. through soil, possibly from near-surface 
disposal, or spills, of contaminants. The pore space in these zones holds 
water or non-aqueous-phase liquids, plus gases. Although Darcy's law 
remains approximately applicable in unsaturated flow, the hydraulic 
conductivity parameter, K, is now a function of the head, h, so the partial 
differential equation becomes non-linear, increasing difficulties with 
numerical solutions. Several approximate analytical forms have been proposed 
to describe K = K(h); see Schwartz et al. (1990), Chapter 3. Furthermore, 
unsteady and transient flows in the unsaturated zones are the rule, whereas 
the flow in saturated regions tends to be steady. These features further 
complicate model parameter specification and equation solution. Transport of 
contaminants in unsaturated regions also is subject to a greater number of 
physical and biological processes that are interactive, complex, and less 
well-understood than are those in the saturated regions. 

Still more serious complications arise when modeling the flow of water 
and dissolved contaminants through rocky media that are extensively fractured. 
Such fractures form haphazardly placed channels with apertures of varying 
sizes, and hence flow capacities. The hydraulic conductivities of fractured 
media may actually change with applied stress, so that pumping from a well in 
such regions can alter the effective porosity of the region, changing its 
flows. Various modeling options have been employed to handle the 
fractured-media flow problem. A first approach is to make a continuum 
approximation: one represents fractured media flow as that in an equivalent 
porous medium, defining a hydraulic conductivity parameter to be incorporated 
into the aforementioned partial differential equation; this is now called an 
equivalent porous media model. This approach is inaccurate if the region is 
connected but very sparsely fractured, i.e. if a small local region is being 
considered. A second approach assumes dual porosity, meaning that the 
porosities of the rock matrix and of the fractures are separately identified, 
as are flows within and between rock and fractures; the model now consists of 
two coupled partial differential equations and is referred to as a dual porous 
media model. Finally, if fracture flow predominates (the rock is relatively 
impermeable), the flow is modeled as occurring through a network; this is the 
discrete fracture model. Practical uncertainties exist as to the actual 
fracture configuration, which add to the uncertainty of prediction of flows of 
water plus transported pollutants. 

Various methods of ascertaining properties of underground regions are in 
use so that roughly appropriate choices of flow and transport models can be 
made. These include evidence from bore holes, the application of kriging 
(interpolation between spatially separated observations; see Cressie (1991)), 
use of tracer information, inverse problem solution (inference of concealed 
hydraulic parameters from head and flow observation), and others; see Anderson 
and Woessner (1992), Chapter 8, on the calibration process, wherein a model is 
adjusted to fit available data and the success of the fit is examined. 

Uncertainty, Variability and Stochastic Models 

Sizable heterogeneity and irregularity of the media through which water 
and contaminants pass, and in which they are stored, have prompted the 
development of probabilistic or stochastic (i.e. chance or random) models to 
supplement the earlier deterministic versions. For example, Gelhar (1986) 
treats the logarithm of the hydraulic conductivity, K, as a spatially 
correlated Gaussian random process, which he then uses to characterize the 
induced head statistics, e.g. its variance. He utilizes mathematical 
techniques to deduce that "the large-scale transmissivity (hydraulic 
conductivity) of an aquifer is obtained by averaging the logarithms of local 
transmissivities that are measured." It is stated that the same approach can 
be used to evaluate an effective large-scale dispersion coefficient (the 
"macrodispersivity") relevant to solute transport. More recent work by Glimm 
and Sharp (1991), and by Zhang (1992), expands upon the characterization of 
heterogeneous porous media (the habitat of groundwater) as general random 
fields. The assumption of the log-Gaussian (or log normal) distribution for 
hydraulic conductivity goes back at least to Freeze (1975). It is legitimate 
to ask about the sensitivity of results based on this approach to its 
assumptions, in particular that of Gaussian distributions, for heavy-tailed 
and otherwise non-Gaussian distributions occur extensively in other areas. 
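
The quoted result amounts to using the geometric mean of the local 
conductivities as the large-scale value. The sketch below contrasts the 
geometric and arithmetic means for simulated log-normal conductivities. It 
assumes independent samples, and so ignores the spatial correlation that the 
analysis of Gelhar (1986) accounts for; the function names and parameter 
values are hypothetical. 

```python
import math
import random


def sample_lognormal_K(n, mean_log=0.0, sd_log=1.0, seed=0):
    """Draw n independent log-normal conductivities (log K is Gaussian)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mean_log, sd_log) for _ in range(n)]


def geometric_mean(values):
    """Average the logarithms, then exponentiate."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    return sum(values) / len(values)


K_local = sample_lognormal_K(5000)
K_geometric = geometric_mean(K_local)    # close to exp(mean_log) for large samples
K_arithmetic = arithmetic_mean(K_local)  # pulled upward by the long right tail
```

For a skewed distribution such as the log-normal, the geometric mean lies 
below the arithmetic mean, so the choice of averaging rule matters when a 
single effective conductivity is assigned to a region. 
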

Note that randomness may enter the groundwater picture not only by way of 
media characterization, as above, but through representation of the recharge 
process, which is influenced by the irregular occurrence of rain, snow, and 
heat. Chemical and biological processes occurring in the subsurface 
environment are also so affected. In summary, significant fluctuation and 
variability (sometimes called structural randomness) may well occur in media, 
recharge and discharge, and in the properties of the items transported 
therein. Early models ignored such effects; more modern models include them. 

An additional component of uncertainty or randomness results from errors 
inherent in the measurement process for determining properties of the media 
and inputs and outputs at particular sites from observations; this component 
may be called measurement error or statistical randomness. In the literature 
(see Anderson and Woessner (1992)) measurement errors are also frequently 
characterized by a Gaussian or normal distribution: the familiar "bell-shaped 
curve." However, an extensive statistical literature on robustness suggests 
caution; see Tukey (1984), p. 614. In particular, care should be taken in the 
application of ordinary least-squares regression techniques: these tend to be 
undesirably responsive to isolated outlying observations. Such comments 
potentially apply to kriging, a technique for smoothing and interpolating 
noisy spatial observations that is increasingly applied in groundwater 
studies. Cressie (1991) considers robust kriging, a technique that 
down-weights isolated maverick observations that are the result of aberrant 
measurements. 
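
The sensitivity of least squares to a single outlier is easy to demonstrate. 
The sketch below compares the ordinary least-squares slope with a simple 
robust alternative, the median of all pairwise slopes (the Theil-Sen 
estimator). That estimator is chosen here only for illustration; it is not the 
robust kriging procedure of Cressie (1991), and the data are made up. 

```python
from itertools import combinations
from statistics import mean, median


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Median of the slopes through every pair of points."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)


# Ten observations on the line y = 2x, with the last one grossly in error.
x = [float(i) for i in range(10)]
y = [2.0 * xi for xi in x]
y[-1] = 100.0

slope_least_squares = ols_slope(x, y)   # dragged well above 2 by the outlier
slope_robust = theil_sen_slope(x, y)    # remains at the slope of the bulk of the data
```
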

Since both structural randomness and random measurement errors seem 
plausible and have been invoked, their blending in a Bayesian approach 
suggests itself; see Berger (1985). Such an approach is briefly mentioned by 
Anderson and Woessner (1992), who refer to work by Freeze et al. (1990). 
Briefly, Bayesian theory (named for the Reverend Thomas Bayes, 1763) 
represents uncertainty in a physical or biological parameter by 
calculating its probability distribution, the so-called posterior 
distribution. The ingredients of this distribution are a prior distribution, 
which incorporates general information about the unknown value obtained from 
other situations and expert judgment, and a likelihood function that 
represents the information given by measurements on the specific situation of 
concern. For the mechanics, see Berger (1985). The Bayesian methodology is 
capable of formally incorporating information from other, similar, sites into 
the estimation of properties of a site of current concern, as well as 
measurements taken at the latter. The classical statistical approach omits 
formal treatment of information other than that obtained at a particular site. 
There apparently exist computer codes for carrying out such processes; several 
references are given by Anderson and Woessner (1992), Chapter 8. A Bayes 
version of kriging exists; for an account again see Cressie (1991). 
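
The combination of prior and likelihood can be made concrete in the simplest 
textbook case. The sketch below assumes a Gaussian prior for a site parameter 
(say, mean log-conductivity, with the prior summarizing other, similar sites) 
and Gaussian measurement errors of known variance; under those assumptions the 
posterior is again Gaussian. The function name and the numbers are 
hypothetical, and real site characterization would seldom be this simple. 

```python
def posterior_normal(prior_mean, prior_var, measurements, meas_var):
    """Posterior mean and variance for a Gaussian prior and Gaussian likelihood.

    prior_mean, prior_var : information from other sites and expert judgment
    measurements          : observations taken at the site of concern
    meas_var              : (assumed known) variance of the measurement error
    """
    n = len(measurements)
    # Precisions (inverse variances) of prior and data add.
    post_precision = 1.0 / prior_var + n / meas_var
    post_var = 1.0 / post_precision
    # The posterior mean is a precision-weighted average of prior mean and data.
    post_mean = post_var * (prior_mean / prior_var + sum(measurements) / meas_var)
    return post_mean, post_var


# Prior from similar sites: mean 0, variance 1. One site measurement of 2.0
# with error variance 1 moves the estimate halfway and halves the variance.
site_mean, site_var = posterior_normal(0.0, 1.0, [2.0], 1.0)
```

With no site measurements the function returns the prior unchanged; as 
measurements accumulate, the posterior is dominated by the site data. 
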

Another approach to the uncertainty problems in the groundwater arena is 
that of expert systems, or more generally artificial intelligence (AI). 
Several publications and expert system tools such as knowledge bases and 
inference engines are referenced in Schwartz et al. (1990), Chapter 7. For a 
way into AI and uncertainty ideas, see Pearl (1988). 

Remediation of contaminant content of groundwater is often attempted by 
pumping fresh (low or uncontaminated) water into and out of aquifers in an 
attempt to dilute and flush out existing contamination. The effectiveness of 
this methodology depends upon the nature of the contaminant; if the 
contaminants are multiphasic, non-soluble (immiscible) and have been trapped in 
pores of rocks or soils, their removal by pumping technology is extremely slow; 
see Travis (1992). 

3. COMPARTMENT AND PHARMACOKINETIC MODELS 

Models that envision either the physical environment or an animal or 
human body as a collection of inter-linked but homogeneous compartments are 
called compartment or pharmacokinetic models. They behave as follows: a 
substance enters one or more compartments according to a specified time 
pattern; once in the compartment it resides there for a characteristic time 
before it changes form, disappears, or passes to another compartment, where 
the process continues; cycling back and forth between compartments may occur. 
The substance may in fact be a combination of more elementary substances. The 
agent that carries the specified substance can be blood or other bodily 
fluids, so the amount present is naturally expressed as a concentration. In 
the case of animal or human body compartments, the modeler's goal may be to 
predict the dosage and transformation of a medicine, drug, anesthetic, or 
possibly toxin, at a particular organ, such as the liver. The varying rate at 
which dosage is eventually administered to the target organ depends upon the 
rate of substance entry into the blood stream, e.g. from the lung or stomach, 
and thereafter on the rate of transfer within and between organs that precede 
the target organ in natural order, and also on the blood flow rate. The term 
pharmacokinetics is used to describe this process when the context is 
generally biological. Classical pharmacokinetic models are expressed in terms 
of systems of linear ordinary differential equations with constant 
coefficients that reflect the rate at which concentration changes in the 
various compartments. Physiologically-based pharmacokinetic models (PB,PK for 
short) utilize physiological/biological interpretations of mechanism to 
specify the equation coefficients, which need not be constant and may in fact 
depend on concentration. PB,PK models are, theoretically, capable of 
validly representing intraspecies biological responses, and give promise of 
useful interspecies extrapolations. See Bischoff (1987) for more details. 
Quite recent work by Bois, Gelman, Jiang and Maszle (1994) has applied modern 
Bayesian statistics to fit a pharmacokinetic-physiological toxicokinetic model 
to assess the fraction of tetrachloroethylene metabolized at a given dose level, 
taking account of individual variability plus estimation uncertainty. 
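
A classical compartment model of the kind described can be sketched in a few 
lines. The example below assumes two compartments (say, blood and a tissue) 
with constant transfer rates k12 and k21 and first-order elimination from the 
first compartment, and integrates the resulting linear differential equations 
by a fourth-order Runge-Kutta scheme. It tracks amounts rather than 
concentrations (the two are proportional if compartment volumes are fixed); 
the structure, names, and rate values are illustrative assumptions, not a 
model from the cited literature. 

```python
def simulate_two_compartment(a1, a2, k12, k21, k_elim, dt=0.01, t_end=10.0):
    """Integrate da1/dt = -(k12 + k_elim) a1 + k21 a2,  da2/dt = k12 a1 - k21 a2.

    a1, a2 : initial amounts in compartments 1 and 2
    k12    : transfer rate from compartment 1 to 2
    k21    : transfer rate from compartment 2 to 1
    k_elim : elimination (or metabolism) rate from compartment 1
    Returns a list of (time, a1, a2) tuples.
    """
    def rates(x1, x2):
        return (-(k12 + k_elim) * x1 + k21 * x2, k12 * x1 - k21 * x2)

    history = [(0.0, a1, a2)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        # Classical fourth-order Runge-Kutta step.
        p1, q1 = rates(a1, a2)
        p2, q2 = rates(a1 + 0.5 * dt * p1, a2 + 0.5 * dt * q1)
        p3, q3 = rates(a1 + 0.5 * dt * p2, a2 + 0.5 * dt * q2)
        p4, q4 = rates(a1 + dt * p3, a2 + dt * q3)
        a1 += dt * (p1 + 2.0 * p2 + 2.0 * p3 + p4) / 6.0
        a2 += dt * (q1 + 2.0 * q2 + 2.0 * q3 + q4) / 6.0
        history.append((step * dt, a1, a2))
    return history


# A unit dose placed in compartment 1 at time zero.
closed = simulate_two_compartment(1.0, 0.0, k12=0.5, k21=0.25, k_elim=0.0, t_end=50.0)
cleared = simulate_two_compartment(1.0, 0.0, k12=0.5, k21=0.25, k_elim=0.1, t_end=50.0)
```

Without elimination the total amount is conserved and the compartments settle 
to the ratio k21 : k12; with elimination the total declines steadily. 
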

Analogous models can be used to describe the flow or transfer of 
dissolved chemicals, pesticides or waste material from the earth's surface 
through soil and rock to groundwater in aquifers. Presumably the predicted 
concentrations of such items in, say, drinking or washing water, which may be 
understood as the output of an appropriate environmental component 
(compartment) model, provide inputs to a pharmacokinetic (compartment) model 
eventually describing the dosage of (some transformation of) the dissolved and 
transported chemical to a particular target organ within a human body or 
another organism. This dosage could, in turn, provide input to a 
pharmacodynamic model, e.g. a multi-hit or clonal expansion model, cf. 
Moolgavkar and Luebeck (1990), or Portier (1989), to predict the occurrence of 
carcinogenetic material in an animal or human organism. A nice overview of 
PB,PK compartment models is given by Andersen (1987). A list of open problems 
awaiting solution to provide improved risk analysis tools is in Rhomberg 
(1988). More detail and further references occur in Section 5, on 
dose-response models. 

PB,PK models have been tested empirically by several groups of 
investigators. For example, see Andersen, Clewell, Gargas, Smith, and Reitz 
(1987), Reitz, Nolan, and Schumann (1987), who studied methylene chloride, and 
Travis, Quillen, and Arms (1990) who examined benzene. In both of these latter 
cases data from rats, mice and humans were modeled with the objective of 
explaining or relating concentrations of the chemicals in question (methylene chloride
and benzene respectively) and their metabolites in target organs. These 
studies have as an objective the reconciliation of various empirical 
investigations on the basis of plausible biological mechanism. 

The strategy currently followed in applications of PB,PK thinking is, 
first, to identify physiologically meaningful compartments; in the methylene 
chloride case four were used: liver (the primary metabolizer), fat, organs
such as brain, heart, kidney and other viscera, and muscle; in the benzene case
bone marrow was added, for it and liver are the primary organs that metabolize
benzene. The differential equation coefficient values are then specified from
previous related experimental studies. Explicit recognition is given to
nonlinear process-saturation terms such as the Michaelis-Menten expression
used in modeling metabolism and also to gastric absorption rate as a function
of dose level. In Travis et al. (1990) the Michaelis-Menten parameters, V_max
and K_m, were actually determined by fitting the model to existing data. After
the model is fully specified it is asked to explain data on concentrations of
the dosed material or its metabolites at target organs. In both Reitz et al.
(1987) and Travis et al. (1990) the objectives are predictions of responses, i.e.
concentration in an important organ, or other endpoints, as a function of 
time. In general, agreement appears qualitatively reasonable to good. However, 
some such published comparisons appear to depend on parameter values obtained 
from the data being described, so the term "prediction" is perhaps a bit 
generous. To test true predictive capability of a model it would be necessary 
to apply it to an independent data set: apparently this has been done in some 
cases with reasonable success. 
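
To make the structure of such a model concrete, the following is a minimal
sketch of a two-compartment model (body and liver) with Michaelis-Menten
metabolism in the liver. The compartment layout, the function names and every
parameter value are hypothetical and chosen only for illustration; they are
not taken from Reitz et al. (1987) or Travis et al. (1990), and partition
coefficients and absorption are ignored.

```python
def simulate_pbpk(dose, t_end=24.0, dt=0.001,
                  q=1.0, v_body=10.0, v_liver=1.0,
                  v_max=0.5, k_m=0.2):
    """Euler integration of a two-compartment model with saturable metabolism.

    All values are hypothetical. Units are arbitrary but consistent:
    amounts in mg, volumes in L, time in h, flow q in L/h,
    v_max in mg/h, k_m in mg/L.
    Returns (amount in body, amount in liver, amount metabolized).
    """
    a_body, a_liver, metabolized = dose, 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        c_body = a_body / v_body
        c_liver = a_liver / v_liver
        exchange = q * (c_body - c_liver)            # net transfer body -> liver
        metab = v_max * c_liver / (k_m + c_liver)    # Michaelis-Menten rate
        a_body -= exchange * dt
        a_liver += (exchange - metab) * dt
        metabolized += metab * dt
    return a_body, a_liver, metabolized


def fraction_metabolized(dose, **kwargs):
    """Fraction of the administered dose metabolized by time t_end."""
    return simulate_pbpk(dose, **kwargs)[2] / dose


for dose in (1.0, 10.0, 100.0):
    print(f"dose = {dose:6.1f} mg, fraction metabolized in 24 h = "
          f"{fraction_metabolized(dose):.3f}")
```

Because the metabolic rate can never exceed v_max, the fraction metabolized
falls as the dose rises; this saturation is the kind of nonlinearity that
makes the dose reaching a target organ a non-proportional function of the
administered dose.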

A good modern introduction to pharmacokinetic modeling is Gillette and 
Jollow (1987). There the reader will find several references to validation of
models; although details are not given they can presumably be obtained from 
the authors. 

Uncertainty, Variability and Stochastic Models 

It is clear that the response of even quite similar organisms to 
controlled doses of a medicine or possible toxin will vary somewhat 
inexplicably, suggesting the need for models that enhance the original, 
deterministic, ordinary-differential-equation setups. An initial step has been 
to provide or adapt models to express concentrations as randomly varying 
around a mean. Thus the models represent the number of elementary particles of 
a foreign element, e.g. a chemical, in a compartment as fluctuating randomly, 
and often independently, governed by fixed (and known) transition 
probabilities. This phenomenon can be called outcome variability. Later
approaches capitalized on the large numbers (of particles) involved by using
continuous (diffusion) approximations, cf. Lehoczky and Gaver (1977). The
latter approach has the advantage of close contact with the initial
deterministic models (the deterministic model gives precisely the mean of the
diffusion approximation) and of furnishing a description of random fluctuation
around that mean that is the familiar Gaussian. However, the scale of random 
fluctuation is on the order of the square root of the mean, which may be 
smaller than that observed in practice. Outcome variability can be negligible 
in practice. 
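
The square-root scaling can be checked with a short calculation. The sketch
below assumes, as in the simplest particle models described above, that each
of N particles is independently present in a compartment with a fixed
probability p; the values of N and p are hypothetical. The count is then
binomial, and its relative fluctuation (standard deviation divided by mean)
shrinks like one over the square root of N.

```python
import math


def binomial_cv(n_particles, p_present):
    """Coefficient of variation of the particle count in a compartment when
    each of n_particles is independently present with probability p_present."""
    mean = n_particles * p_present
    sd = math.sqrt(n_particles * p_present * (1.0 - p_present))
    return sd / mean


for n in (1e2, 1e6, 1e12):
    print(f"N = {n:.0e}: relative fluctuation = {binomial_cv(n, 0.3):.2e}")
```

For the very large particle numbers typical of a chemical dose, the relative
fluctuation from this source alone is tiny, which is why outcome variability
is unlikely to account for the variation seen between real organisms.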

A second, and possibly dominant, source of variability is in the 
parameters of the differential equations themselves: it is plausible that these
vary in time within individual organisms, and, possibly more importantly,
between organisms. This variability is entirely analogous to the structural
variability mentioned in connection with groundwater models in Section 2.
Various researchers have incorporated the between-organism component into
analyses: Sheiner and Beal (1980) have provided NONMEM (Nonlinear Mixed
Effects Model), which computes an approximate joint Bayesian posterior
distribution of the various parameters in a given compartment model (this is
treated as log-multivariate Gauss/normal) and thence computes the
(posterior-based) estimate of, say, a concentration at a target organ; PREDPP
is a package used by NONMEM for this purpose. The latter calculation is
accomplished either by Monte Carlo simulation ("bootstrapping") from the above 
parameter posterior or by an approximate numerical calculation. Other 
investigators, e.g. Farrar, et al. (1989), have taken the bootstrapping route 
as well. A further step, taken by some investigators, is to invoke Bayesian 
formalism to infer, predict, and control (in the case of a drug or anesthesia) 
the concentration of a substance at a target organ in a particular individual,
in the face of uncertainty as to how that individual is processing the 
substance input, and allowing for measurement error; see Bois et al. (1994). 
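
The Monte Carlo route can be sketched as follows. This is an illustration
only, not the NONMEM algorithm: it assumes a one-compartment model
C(t) = (dose / V) exp(-k t), draws the elimination rate k and the volume V
independently from log-normal distributions with hypothetical medians and
spreads, and summarizes the resulting spread of concentrations across
simulated individuals.

```python
import math
import random


def sample_concentrations(n, dose=100.0, t=4.0, seed=0):
    """Simulate n individuals with log-normally distributed parameters and
    return the sorted concentrations C(t) = (dose / V) * exp(-k * t)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        k = rng.lognormvariate(math.log(0.2), 0.3)    # 1/h, hypothetical
        v = rng.lognormvariate(math.log(40.0), 0.2)   # L, hypothetical
        values.append(dose / v * math.exp(-k * t))
    return sorted(values)


def percentile(sorted_values, q):
    """Simple order-statistic percentile for q in [0, 1]."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


conc = sample_concentrations(5000)
low, med, high = (percentile(conc, q) for q in (0.05, 0.5, 0.95))
print(f"5%: {low:.2f}  median: {med:.2f}  95%: {high:.2f} (mg/L)")
```

A joint (correlated) parameter distribution, as described above, would replace
the two independent draws; the summary step is unchanged.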

The above approaches all treat model coefficients as constants that are 
picked from a population described by a distribution but thereafter held 
fixed. To date it appears that little successful work has been done to 
characterize the effect of random temporal parameter variability on target 
concentration variation in an individual organ, although such a model might 
well be more realistic than those discussed. Models exist in which parameters 
change because of the presence of pharmaceuticals or toxins; see Jackson 
(1993) and Gaver, Jacobs, and Carpenter (1994).

Although a considerable amount of effort has been devoted to modeling 
uncertainty in pharmacokinetic compartment model parameters, very few of the 
results seem to have been adapted for the use of compartment modelers in the 
environmental sciences. See MacKay and Peterson (1991) who are concerned with 
modeling the environmental fate of organic chemicals (pesticides, PCBs, wood 
preservatives, incineration byproducts, etc.) that, purposefully or 
inadvertently, enter the environment and proceed through soil, water and air 
to expose humans, plants and animals. The effect of uncertainties in the 
various parameters (transport and partition) can be addressed by the 
resampling or bootstrapping approach of Efron and Tibshirani (1986), assuming 
existence of information concerning their distribution. The same is true in 
compartment-model studies of persistent contaminants that concentrate in food 
chains; see Moriarty (1984), who fits a three-compartment model for dieldrin 
absorption in tissue; stochastic models are mentioned in passing but are not
actually applied. 

4. HUMAN EXPOSURE MODELING AND ANALYSIS

Environmental regulation is intended to protect human public health and 
welfare from adverse effects of environmental pollution. Environmental 
remediation involves reduction of environmental pollution levels to or below 
some tolerable level, and the maintenance of that condition. Thus formal 
regulatory demands and common sense suggest that the level of pollutant to 
which actual human beings are exposed when in a neighborhood of one or more 
polluting sources is of direct relevance. Consequences of experiencing various 
levels of different pollutants via various exposure routes (air, water, food, 
soil) must often be assessed by models: candidates are physiologically-based 
pharmacokinetic models that predict the within-organism transfer of polluting 
chemicals by way of intake from external sources to organs; thereafter one or 
more dose-response, or pharmacodynamic, models (see next section) are invoked
to convert organ-level dosage to biological responses or endpoints such as the 
occurrence of carcinogenic material or other disease forms. 

The importance of quantifying the link between the presence of pollutants 
in a medium and resulting human exposure has stimulated the development and 
use of a number of computer models. In 1987 the Environmental Protection 
Agency established the Center for Exposure Assessment Modeling (CEAM) in 
Athens, Georgia. According to Ambrose and Barnwell (1989), that office 
supplies "predictive exposure assessment techniques for aquatic, atmospheric, 
terrestrial, and multimedia pathways for organic chemicals and metals"; the 
techniques are in the form of computer modeling packages that are made 
available to users on diskettes. A general description of a number of the
available models is contained in Ambrose and Barnwell (1989). Such models are
generally descriptive of pollutant concentrations in the physical environment 
but do not appear to make a quantitative connection between such 
concentrations and the actual uptake of such pollutants by humans, or other 
inhabitants of the surrounding ecological system. 

Efforts to quantitatively monitor individual human exposure to 
pollutants, e.g. within particular closed spaces or near hazardous sites, are 
described by Ott (1990). The approach has been called Total Human Exposure
(THE). Briefly, there are two versions of THE. In the direct approach a 
probability sample of individuals is selected that is representative of a 
particular exposed population. With the aid of personal exposure meters 
carried by those sampled, supplemented by their activity diaries, the attempt 
is made to relate individual exposure to sources of pollution in a great many 
media. In addition, the body burden of various pollutants is measured. The 
results from the probability sample can be applied to make statistical 
inferences concerning population exposure or dose. The latter data can be used 
as input to dose-response models (see Section 5) so as to infer that
population's risk. The direct exposure monitoring approach has been called the
Total Exposure Assessment Methodology (TEAM); apparently a number of TEAM
field studies have been conducted in various cities. Among the reported 
findings is that the number of indoor sources of toxic agents exceeds the 
number of outdoor sources; the dosage levels indoors can also be much higher 
than those outdoors owing to exposure to cleaning fluids, paints and CO2 in
enclosed conditions such as homes and offices.

The indirect approach to exposure assessment is more truly exposure
modeling. Field data serve as input to characterize pollutant concentrations
in various microenvironments (locations of homogeneous pollutant
concentration) and the randomly varying times spent in these by individuals;
see Duan (1991). Such models can possibly be used to augment direct exposure
assessments to predict human exposure under changed conditions. For more
references and details see Ott (1990).
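
The core calculation of the indirect approach can be sketched in a few lines.
The microenvironment names, concentrations and times below are hypothetical,
and the times are treated as fixed for one person and one day, whereas the
models cited above treat them as random.

```python
def time_weighted_exposure(concentrations, hours):
    """Return (integrated exposure, time-weighted average concentration)
    summed over the microenvironments listed in `hours`."""
    total_time = sum(hours.values())
    integrated = sum(concentrations[m] * hours[m] for m in hours)
    return integrated, integrated / total_time


# Hypothetical pollutant concentrations (ug/m^3) and one activity diary (h).
concentrations = {"home": 12.0, "office": 8.0, "commute": 40.0, "outdoors": 5.0}
hours = {"home": 14.0, "office": 8.0, "commute": 1.5, "outdoors": 0.5}

integrated, average = time_weighted_exposure(concentrations, hours)
print(f"integrated exposure = {integrated:.1f} ug*h/m^3, "
      f"daily average = {average:.2f} ug/m^3")
```

Changed conditions (a cleaner office, a longer commute) are explored by
altering the corresponding entries and recomputing.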

5. DOSE-RESPONSE MODELS

Suppose a dose of a single, or specified combination, of chemicals is
imbibed by a human subject, or a wild or laboratory animal. It is of interest
to relate the response of such subjects to the type and level or concentration
of the dose received. This activity is called dose-response modeling, and is
an important part of a quantitative risk assessment.

It is usual to assess individual response in binary terms: either a 
subject has reached some specified endpoint, e.g. exhibited tumors of a
certain type in a target organ at time of observation (in animal cases, 
sacrifice), or it has not. Less frequently, more complicated responses are 
considered; e.g. Zhu, Krewski and Ross (1992), in a developmental toxicology 
context, record and analyze multiple responses of female rats and mice (and 
their offspring) exposed to the toxin, 2,4,5-T. We concentrate here on binary 
responses, although information may well be lost in some cases by doing so. 

If individuals subjected to a specified dosage level are representative 
of a given population some will, and the remainder will not, exhibit a 
positive response. It is useful to think of the fraction that do respond as an 
estimate of the probability that a random member of the population will
respond. The relationship between dosage level or concentration, d, and the
probability of a particular response as a function of d, i.e. P(d), will
typically appear as shown below in Figure 3: 




Figure 3 

One anticipates that the probability of a positive response, such as 
exhibition of one or more tumors in the liver after sacrifice, will be 
monotonically increasing as depicted, at least in large enough samples so that 
random fluctuations are small. However, reversals do occur and may have
biological significance, i.e. they are not merely sampling variations.

Mathematical representations of dose-response relationships are of two 
general types. The simple statistical type selects a standard statistical 
model such as the Gaussian/normal "bell-shaped curve", or, alternatively, the 
so-called Weibull distribution first introduced as a descriptor of mechanical 
system failure time, and converts that model into a dose-response model. A 
good introduction to such setups is provided by Kalbfleisch and Prentice 
(1980), who also discuss the fitting of such models to data, accounting for
complications that occur when data are missing (test animals have died before
the planned time of sacrifice), and when additional data are available to adjust
for differences in individual age, gender, or environmental conditions such as
temperature. A particularly effective and popular model of this latter type is
the so-called Cox regression model; see Cox (1972, 1984).
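
As an illustration of how a standard distribution is converted into a
dose-response curve, the sketch below gives two common forms: a Gaussian
(probit) curve and a Weibull curve. Applying the Gaussian curve to the
logarithm of dose is an assumption made here for illustration, and all
parameter values are hypothetical.

```python
import math


def probit_response(d, mu, sigma):
    """Gaussian (probit) model on log dose: P(d) = Phi((ln d - mu) / sigma)."""
    if d <= 0.0:
        return 0.0
    z = (math.log(d) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def weibull_response(d, b, k):
    """Weibull model: P(d) = 1 - exp(-(b * d) ** k)."""
    return 1.0 - math.exp(-((b * d) ** k))


for d in (0.1, 1.0, 10.0):
    print(d, round(probit_response(d, 0.0, 1.0), 4),
          round(weibull_response(d, 0.5, 2.0), 4))
```

Both curves rise monotonically from 0 toward 1, as in Figure 3, but they
differ markedly in how fast they approach zero at small doses.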

A conceptually appealing alternative and supplement to the simple
statistical models is the class called pharmacodynamic models; these attempt to
mathematically represent some of the detailed biological mechanisms that
influence organ response. Excellent examples are the multistage
cell-proliferation clonal expansion models described by Moolgavkar and Venzon
(1979), Moolgavkar and Luebeck (1991), and Kopp-Schneider, Portier and Rippman
(1991). The importance of cell proliferation in the cancer-development process
was noted early; a modern account is by Cohen, Portier, and Ellwein
(1991), and a deterministic dynamic simulation model has been presented by
Ellwein and Cohen (1992). There are many papers in this general area. See also
Hart et al. (1986), p. 217. All of the above models explicitly consider the
following multistage process that is currently thought to lead to cancerous 
tumors: first, an initiator event occurs randomly at single-cell level and 
causes permanent genetic damage. After such an event, cell division yields an 
increasing number of precancerous clones. The clones so generated may 
independently die or replicate. A promoter event, usually considered to be a
second gene-damaging event, may lead to the initiated cells becoming a
cancerous lesion or tumor, a typical dose-response endpoint.

The above types of models have been converted to dose-response models by 
various authors. The procedure has generally been to express initiation, 
proliferation and completion rates as linear functions of dosages, somehow 
expressed (any, but often non-decreasing positive, function of time-dependent
concentration is presumably minimally acceptable); see Crump and Howe (1984), 
Murdoch and Krewski (1988), Kodell, Krewski, and Zielinski (1991) for example. 
All of these approaches resemble each other in that they postulate a simple 
mathematical relationship between dose or exposure and biologically-based 
model parameters such as the aforementioned rates. While the above step is 
natural it seems possible that models that more faithfully reflect the actual 
mechanistic interactions between potential toxic agents and cells at a 
molecular level could have increased credibility over wider ranges of dosage. 
In particular, the low-dose and across-species response is of interest in risk
assessment; see below, and also Portier (1989), especially pp. 256-259. One
approach in this general direction has been reported by Freedman and Shukla 
(1991), wherein references to other related work are given. 
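
The idea of making rates linear in dose can be illustrated with the simpler
Armitage-Doll-type multistage form, which omits clonal expansion and is
therefore not the clonal expansion model of the papers cited above. In this
sketch each of k stage-transition rates is a_i + b_i d, and the cumulative
hazard by time t is approximated by the product of the rates times t^k / k!,
an approximation that assumes every rate times t is small. The stage
parameters are hypothetical.

```python
import math


def multistage_response(d, t, stages):
    """Approximate multistage probability of a tumor by time t at dose d.

    `stages` is a list of (a_i, b_i) pairs; the i-th transition rate is
    a_i + b_i * d. Cumulative hazard: prod(a_i + b_i * d) * t**k / k!.
    """
    k = len(stages)
    rate_product = 1.0
    for a, b in stages:
        rate_product *= a + b * d
    cumulative_hazard = rate_product * t ** k / math.factorial(k)
    return 1.0 - math.exp(-cumulative_hazard)


# Hypothetical two-stage example in which only the first stage depends on dose.
stages = [(1e-4, 5e-4), (2e-4, 0.0)]
for d in (0.0, 0.1, 1.0):
    print(d, multistage_response(d, 100.0, stages))
```

With only one dose-dependent stage the excess risk at low dose is nearly
proportional to dose; with several dose-dependent stages the exponent becomes
a higher-order polynomial in d.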

An alternative class of biologically and physiologically based models
comprises those that strive to represent the behavior of some distinguished organ,
particularly when the latter is subject to toxic insult. A well-studied class 
concerns the liver; a good overview of the "distributed parallel tube" liver 
model (and its predecessors and competitors) is given by Robinson (1992). Such
models view the liver as a collection of enzyme-lined tubes through which 
blood flows carrying a substrate to be removed by enzyme action. Natural 
functional heterogeneities in the liver require formal mathematical 
recognition: this can be accomplished by the addition of one model parameter 
in a partially randomized setup.

6. RISK ASSESSMENT

An important application for the models discussed here is to quantitative 
risk assessment. The general objective of risk assessment is to estimate the
type and incidence of detrimental biological effects associated with the 
introduction of various levels of pollutants into the environment. Of major 
but not exclusive concern are the biological effects upon human beings; the 
term pollutants refers not only to chemical and biological substances but also 
to ionizing and electromagnetic radiation and other physical agents, and to 
any combinations thereof. The term "biological effects" has often referred to 
cancer, but should also include genetic and developmental defects and other 
detrimental outcomes including psychological and behavioral abnormalities. Of 
course all such outcomes can in principle occur, in various combinations and 
severities, depending upon the exposure, and the nature and current status of 
the recipient of that exposure. 

The risk attributable to a particular agent or substance can be thought 
of as a combination of the hazard or detrimental effect, i.e. toxicity, of 
that agent, given a level of exposure, and the extent and pattern of exposure 
to the agent. The task of risk assessment is to identify agents whose toxicity 
is a health threat to specified subpopulations, given exposure or dosage at 
specified levels, and to estimate the likely extent of exposure (dosage) of 
those subpopulations. Quantitative risk assessment then often presents its 
conclusions in simple numerical form. The stark form of the numerical 
statements typically quoted in the news media provokes concern by the general 
public, and skepticism by scientists who are aware of the various inferential 
difficulties encountered in obtaining those numbers; see Wall Street Journal,
August 6, 1992, and also Freedman and Zeisel (1988). Nevertheless, attempts to
quantify suitably defined risks will continue to be pursued intensively, in
order to rationalize and communicate, and in particular to assess the
effects both of modifying pollutant introduction into the environment, and of
environmental remediation. Problems of cost-effective risk reduction and of
wise resource allocation are of great interest and concern; see The Economist
(1992), and Kunreuther (1991) for more on the economics of waste and
pollution management, and the cost-effectiveness of regulations as opposed to 
taxes on polluting technologies. 

Efforts to improve the quality and credibility of risk assessment results 
have taken several forms. One is to clearly express the current state of 
scientific knowledge concerning cancer initiation mechanisms; see Hart et al. 
(1986) for example. The Hart committee's summary is in the form of 31 
principles that address mechanisms of carcinogenesis, tests of cancer 
induction, epidemiology, exposure assessment, and risk assessment. The general 
multi-stage nature of the cancer development process has been recognized as 
the first principle derived from the mechanisms of carcinogenesis in the above 
review, and has been incorporated into the biologically-based pharmacodynamic 
models described in Section 5. 

Assessment of human response to low doses of possibly toxic agents, alone 
or in combination, is an important regulatory issue. Realistic estimates of 
exposures to hazardous agents experienced by human beings in real-world 
conditions are often comparatively low, if protracted. Furthermore, exposure 
to chemical toxicants in addition to normal background exposures may well be 
at a comparatively low level. However, available experimental results with 
laboratory animals are typically at a relatively high dose level, i.e. near to
a maximum tolerable dose for the species or somewhat below. The problem of
extrapolation from such laboratory results to the low values of
risk-assessment interest poses difficult questions that have not been entirely
satisfactorily addressed in many cases.
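
A small numerical sketch shows why the extrapolation is delicate. Two
Weibull-type curves are calibrated to give the same response at a high
reference dose, and are then evaluated at a dose one thousand times lower.
The reference dose, the reference response and the shape values are
hypothetical.

```python
import math


def calibrated_weibull(d, k, d_ref=100.0, p_ref=0.5):
    """Weibull-type curve P(d) = 1 - exp(-h * (d / d_ref) ** k), with h chosen
    so that P(d_ref) = p_ref. k = 1 gives the one-hit (low-dose linear) model."""
    h = -math.log(1.0 - p_ref)
    return 1.0 - math.exp(-h * (d / d_ref) ** k)


for d in (100.0, 10.0, 0.1):
    print(d, calibrated_weibull(d, 1.0), calibrated_weibull(d, 2.0))
```

The two curves agree at the reference dose, where laboratory data would be
available, yet at the low dose their predicted risks differ by roughly three
orders of magnitude. High-dose data alone therefore cannot settle the low-dose
question.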

For example, interest has focused on the possible existence of a 
threshold dose, a concentration below which a particular toxic agent would 
have zero response. The position of the Hart committee is expressed in its 
principle 3: "At the present state of knowledge, mechanistic considerations 
such as DNA repair and other biological responses, in general, do not prove 
the existence of, the absence of, or the location of a threshold for 
carcinogenesis."

Low-dose threshold phenomena must be experimentally investigated by 
actually submitting sets of experimental animals to a sequence of lower and 
lower doses, observing the numbers that respond, and interpreting the
dose-response trend. Recent experimental work investigates the dose-response
relationship of N-nitrosodiethylamine (NDEA) and N-nitrosodimethylamine (NDMA) 
and the development of esophageal and liver cancer, and other types as well. 
In this study, Peto et al. (1991a, 1991b), a large number, 4080, of 
Colworth/Wistar rats were given varying, but in particular low, drinking-water 
doses of the above agents, and the incidence of neoplasms, e.g. in the liver, 
was noted. Conclusion: at low doses the fraction of animals exhibiting 
lifetime neoplasms was nearly proportional to dose, "with no indication of any 
'threshold'."

In another very recent study, Portier et al. (1992) investigated the
response to dioxin, 2,3,7,8-TCDD, administered regularly (biweekly) after DEN
initiation, of female Sprague-Dawley rats. The response was taken to be
concentration of a dioxin-mediated protein in the liver. This concentration
was predicted using two biologically-plausible mathematical models, an
"additive" and an "independent" version. The additive model was interpreted to
fit study data better than the "independent" model, but both fit adequately, 
and in neither case was there strong evidence of a threshold or strongly 
sigmoidal non-linear response. From a policy viewpoint this finding is 
interpreted to mean that safe exposure levels are lower than would be the case 
if a threshold were demonstrable. 

The Portier et al. (1992) study assesses the uncertainty associated with 
its inferences by use of the re-sampling or "bootstrapping" methodology 
referred to earlier; see Efron and Tibshirani (1986). In Peto et al. (1991b) a
formal two-parameter Weibull model is fitted but no uncertainty statements 
seem to be made about parameter estimates or extrapolated low-dose responses; 
perhaps this omission merely reflects the authors' skepticism concerning the 
parametric models' validity at truly low doses. 
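
For readers unfamiliar with the technique, the following is a minimal sketch
of a percentile bootstrap interval. It is a generic illustration, not the
procedure of Portier et al. (1992); the data (12 responders among 50 animals)
and the function names are hypothetical.

```python
import random


def bootstrap_interval(data, statistic, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for statistic(data): resample the data
    with replacement n_boot times and take empirical quantiles."""
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = replicates[int(round((alpha / 2) * n_boot))]
    hi = replicates[int(round((1 - alpha / 2) * n_boot)) - 1]
    return lo, hi


def fraction(xs):
    """Observed fraction of responders."""
    return sum(xs) / len(xs)


# Hypothetical data: 1 = animal responded, 0 = did not.
responses = [1] * 12 + [0] * 38
lo, hi = bootstrap_interval(responses, fraction)
print(f"observed fraction = {fraction(responses):.2f}, "
      f"90% interval = ({lo:.2f}, {hi:.2f})")
```

The same resampling loop applies when the statistic is a fitted model
parameter or an extrapolated low-dose response; only the `statistic` function
changes.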

Conventional statistical attention to random sampling uncertainty, e.g. 
as addressed by bootstrapping, largely ignores the effect of model 
uncertainty. However, concern with, interest in, and attention to this feature
of much quantitative risk assessment are naturally evident. Note that an ultimate
objective often is to provide credible and defensible cross-species "mouse to 
man" extrapolations of the effect of suspected toxins or carcinogens. It seems 
to be generally agreed that it is reasonable to regard "chemicals for which 
there is sufficient evidence of carcinogenicity in animals as if they
presented a carcinogenic risk to humans," see Hart et al. (1986), principle 8.
However, this statement is vague and there are difficulties with its 
quantitative and qualitative interpretation; see Freedman and Zeisel (1988) 
and accompanying (discordant) discussion. See also Crouch and Wilson (1979), 
and Crouch, Wilson, and Zeise (1987) who have tried to show a statistical 
association between mouse and rat response; the reality of that association 
has been questioned by D. Freedman (Stanford statistics colloquium, 1992).
More recently Talcott (1992) has pointed out the many places in an 
environmental risk assessment that are vulnerable to uncertainties, and 
recommends systematic attention to these so as to limit the use of arbitrary 
conservative assumptions and safety factors. 

In summary, quantitative risk analysis based on models used for 
extrapolation of animal experiment dose responses to other species, notably 
humans, is demonstrably still somewhat inexact. The recognition of this fact 
has stimulated further research on the fundamental biological phenomena, and 
consequently more attention to development of mathematical models faithful to 
those phenomena. Such work demands, and stimulates, fruitful interplay
between representatives of different scientific disciplines. Issues of 
international, national, and local economics stimulate the acquisition of 
sound information on the risks potentially associated with exposures to 
substances produced by our technologies, so that cost-effective choices can be 
knowledgeably made.

7. EPIDEMIOLOGIC CASE STUDIES

In this concluding section we review approaches taken to assessing 
potential human risk from contaminated water in two different areas of the 
United States: Woburn, Massachusetts, and Battle Creek, Michigan. The 
discussion points up the difficulties of such epidemiological studies. 
Uncertainties include imperfect knowledge concerning exposure, and concerning
dose-response relationships when doses are chronic and involve several toxic
compounds, and when many routes of entry and responses are possible. Nevertheless,
serious attempts at quantitative assessments are valuable in that they focus 
energy on specific issues and questions, and on the revealed deficiencies in 
data and theory that are candidates for improvement. Recognition of sources of 
uncertainty in assessments also contributes to better understanding of the 
value of these assessments, and stimulates efforts to reduce and quantify 
uncertainties.

Case 1: The Woburn Well Water Case (Lagakos, et al., 1986) 

We summarize and discuss the model-based statistical analysis of a
water-associated ecotoxicity situation; see Lagakos et al. (1986). In
1979 it was discovered that two out of eight water wells servicing Woburn,
Massachusetts, were contaminated with various organics: trichloroethylene,
tetrachloroethylene, chloroform, trichlorotrifluoroethane, and
dichloroethylene. Groundwater tests under eastern Woburn, where the two implicated
wells (designated G and H) were located, revealed EPA-priority pollutants. The 
wells were closed in 1979. Subsequent studies by the Massachusetts Department 
of Public Health revealed a cancer mortality rate for Woburn that was 
significantly higher than that for the state and the six adjacent communities. 
Woburn's childhood leukemia rate appeared to be elevated for the 1969-1979
period: 12 cases were diagnosed when 5.3 were expected (p-value = 0.008).
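
A figure of this size can be reproduced under a simple assumption. The sketch
below supposes that the number of cases follows a Poisson distribution with
mean equal to the expected count, and computes the probability of observing 12
or more cases. The Poisson model is an assumption made here for illustration;
the test actually used to obtain the quoted p-value is not stated in the text
above.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


p_value = poisson_upper_tail(12, 5.3)
print(f"P(X >= 12 | mean 5.3) = {p_value:.4f}")
```

Under this assumption the tail probability comes out close to 0.008, in line
with the quoted value.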

Lagakos et al. obtained data and performed statistical analyses to assess 
possible association between access to water from wells G and H and incidence 
rate of childhood leukemia. They also attempted to relate such water 
consumption to adverse pregnancy outcomes and childhood disorders; their logic 
was that the latter health effects are of shorter latency than leukemia and 
thus may be more sensitive indicators than is leukemia incidence. 

The data that were analyzed statistically consisted first of 20 cases of 
childhood leukemia diagnosed in Woburn over roughly the G and H pumps' active 
period (1964-1983). Exposure to G and H water was scored by year and
cumulatively, according to the residence history. A telephone sample survey
was conducted in 1982 in order to obtain information on incidence of adverse
pregnancy outcomes and childhood disorders (APO/CD) during the period
1960-1982, along with mother's residence history. Care was
taken in the survey, but since it managed to contact something like 50% of
Woburn residents with listed phones, and since the many types of adverse 
responses were grouped into categories, the use of the survey data was 
criticized by some discussants of the paper. 

Statistical analysis of the 20 leukemia cases was conducted with the aid 
of a statistical model, the so-called Cox failure time regression model; see
Cox (1972). That model relates the age-dependent rate of early occurrence of
leukemia to exposure. Strength of association is measured by the degree of 
positivity of a regression parameter, which turned out to be positive with 
moderate statistical significance. A logistic regression approach when applied 
to data on pregnancy outcomes and childhood disorders also indicated an
association of some of these responses and access to G, H water. 
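
The role of the regression parameter can be made concrete. In a proportional
hazards model the rate at age t for a subject with exposure x is written
h(t | x) = h0(t) exp(beta x), so that exp(beta x) is the hazard relative to an
unexposed subject. The sketch below evaluates that ratio; the value of beta is
hypothetical and is not the estimate reported by Lagakos et al. (1986).

```python
import math


def relative_hazard(beta, exposure):
    """Hazard ratio exp(beta * exposure) relative to an unexposed subject
    under a proportional hazards model h(t | x) = h0(t) * exp(beta * x)."""
    return math.exp(beta * exposure)


beta = 0.3  # hypothetical coefficient per unit of cumulative exposure
for x in (0.0, 1.0, 2.0):
    print(x, round(relative_hazard(beta, x), 3))
```

A positive beta means the rate rises with exposure, a zero beta means no
association, and the baseline rate h0(t) cancels out of the comparison.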

Lagakos et al. carefully attempted to check for survey biases, such as
could be caused by overreporting among those exposed to G, H, and
underreporting among those unexposed; they concluded that these biases were 
negligible. (Note: this effort did not satisfy at least one discussant of the 
paper.) The effect of inexact exposure estimation was also assessed by redoing 
calculations based on a coarser partitioning of G, H exposure levels than that 
first used. Such a step resembles classical errors-in-variables strategies, 
e.g. the Wald grouping method; cf. Fuller (1987). The procedure resulted in
the same significant associations as previously detected. 

In summary, inference concerning risk of leukemia and APO/CD was
conducted using data on exposure to Woburn wells G and H and mathematical
models deemed appropriate for such investigation. The investigators concluded
that their analysis, while showing positive association, did not explain the
entirety of Woburn's leukemia excess. Little evidence was found of increase in
spontaneous abortion or low birth weight with increased G, H exposure, but
perinatal/stillbirth rate was up, as were (strongly) eye/ear and
chromosomal/oral cleft anomalies. Other positive associations were found as
well.

Six discussants commented upon the reported study. All were free in 
pointing out deficiencies, many of which were acknowledged by the authors. 
Prominent among the deficiencies noted were (apparently unavoidable) 
difficulties with exposure assessment and the survey data, and the possibility 
of overinterpretation of positive indications of association because of the 
"multiple testing" phenomenon; also, some doubt was cast on the accuracy of
approximations used to calculate p-values, and on the sensitivity of the
latter values to deletion/addition of single observations. All such valid 
comments contribute to a better understanding of the difficulties of 
conducting convincing studies of environmental risk; their recognition 
presumably will lead to improvements in future studies. 
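
The "multiple testing" concern raised by the discussants can be illustrated 
with a short calculation. The sketch below is not taken from the Woburn paper 
or its discussion; it assumes, purely for illustration, that m tests are 
statistically independent and that every null hypothesis is true. The function 
names and the levels used are hypothetical. 

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive among m independent tests,
    each carried out at level alpha, when all null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test level that keeps the familywise error rate at or below alpha
    (the Bonferroni correction, a conservative choice)."""
    return alpha / m


if __name__ == "__main__":
    # Illustrative numbers only: the more outcomes are tested, the more
    # likely it is that some "significant" association arises by chance.
    for m in (1, 5, 20):
        print(m, round(familywise_error_rate(0.05, m), 3),
              round(bonferroni_level(0.05, m), 4))
```

Under these assumptions, testing many health outcomes against one exposure 
measure makes an isolated significant result much less persuasive than its 
nominal p-value suggests. 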

Case 2: Battle Creek Health Study (Freni and Bloomer, 1988) 

In 1981 an aquifer servicing the Battle Creek, Michigan, area was found 
to be contaminated with various volatile organic chemicals (VOCs). The wells 
were subsequently closed. Groundwater contamination with the same VOCs was 
later detected in Dowagiac and in Springfield, adjacent to Battle Creek. The 
Michigan Department of Public Health proceeded to conduct a comprehensive 
epidemiologic study of the potential health effects of the contaminated 
drinking water, called the Battle Creek Health Study; see Freni and Bloomer 
(1988). 

An initial literature review indicated that adverse effects of chronic 
exposure to particular VOCs had been observed only at levels much higher than 
those discovered in the Battle Creek drinking water. However, this information 
did not extend to situations involving multiroute exposure to multiple VOCs. 
Consequently, a retrospective cohort study was designed: a cohort of exposed 
people was compared to a reference or control cohort of unexposed people with 
respect to incidence of diseases or other health parameters during a follow-up 
period. The reference cohort was from neighborhoods comparable to the 
contaminated areas with respect to age, size, and value of dwellings; the 
individuals selected for both cohorts were comparable with respect to age and 
sex. There were some refusals to participate (about 20% in exposed, and 30% in 
reference, cohorts), a common occurrence in such studies that is potentially 
biasing. 

The quality of raw data on the extent of exposure of the exposed cohort was 
"extremely poor", according to Freni and Bloomer (1988). It was necessary to 
construct a mathematical model to infer, from the available well-monitoring 
data, the time of start of residential well contamination and the 
total accumulated exposure (TAEVOC) of individuals; these latter estimates 
were supplemented by interview data and, when possible, converted to inferred 
dosage in drinking water (TAEDOSE); estimated dosages varied considerably 
across individuals. As was true in the Woburn Study, dosage levels were 
indirect, and hence uncertain, although the Battle Creek Study devoted much 
thought and energy to quantifying individual exposure not only by drinking 
water but also from showering and bathing. 

Health data were obtained from several sources: interviews, in which 
subjects were asked to recall diseases experienced in the past, and their date 
of diagnosis; medical records; clinical examination for subjects at least five 
years old; mortality and hospital discharge rates. Efforts to obtain a variety 
of possibly informative data seemed greater than those made in the Woburn 
Study. 

It was judged likely that an interview bias existed: those in exposed 
areas, who became aware of possible exposure through the media, by word of 
mouth, or when interviewed, may have recalled more disease than others. 
However, analysis of the uncorrected 1976-1980 mortality data for Battle Creek 
is reported to have shown that the city had "significantly higher" rates for 
eight of the state's ten leading causes of death; these include heart disease 
and cancer. The magnitudes of the effects are not reported. 



Statistical models were brought into play to analyze the above data. The 
simplest and most traditional of these is the odds ratio, defined in terms of 
these numbers: for a specified risk factor, R, and a specified disease (or 
indicator thereof), D, denote 

$n_{DR}$ = Number exposed to risk factor R that exhibit disease D; 

$n_{\bar{D}R}$ = Number exposed to risk factor R that do not exhibit disease D; 

$n_{D\bar{R}}$ = Number not exposed to risk factor R that exhibit disease D; 

$n_{\bar{D}\bar{R}}$ = Number not exposed to risk factor R that do not exhibit disease D. 

Then the odds ratio (for D, given R) is computed as 

\[ \mathrm{OR}(D; R) = \frac{\text{Odds for } D\text{, given } R}{\text{Odds for } D\text{, given not-}R} = \frac{n_{DR}/n_{\bar{D}R}}{n_{D\bar{R}}/n_{\bar{D}\bar{R}}} \]



For example: Suppose 250 individuals (= study participants) consumed water 
with high VOC concentration (risk factor R); of these, 50 = $n_{DR}$ exhibited 
liver disease. A reference or control group of 500 individuals consumed water 
with low or normal VOC concentration; of these, 10 = $n_{D\bar{R}}$ exhibited 
liver disease. Then the odds ratio for liver disease, D, given the high-VOC 
risk factor, R, is 

\[ \mathrm{OR} = \frac{50/200}{10/490} \approx 12.3, \]

an attention-getting number that strongly suggests an association of 
liver disease, D, with high-VOC risk factor, R. (No such levels were found in 
the Battle Creek Study.) A numerical value close to unity indicates neutrality 
so far as effect of the risk factor goes. A numerical factor less than unity 
suggests that absence of the specified risk factor is associated with greater 
disease prevalence in the sample analyzed. This effect may be caused by the 
action of an unsuspected additional risk factor in the reference group. The 
odds ratio statistic is very frequently employed in the Battle Creek Study, 
often adjusted for estimated exposure or dosage by stratification. In such 
cases the numbers in the various categories are small, so the sampling 
variability is large. As a result, only a few of the calculated odds ratios 
reach statistical significance at the (modest) level p = 0.10. 
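
The odds ratio arithmetic of the hypothetical liver disease example can be 
checked with a few lines of Python. This is a minimal sketch: the function and 
argument names are hypothetical, and the interval shown uses Woolf's 
large-sample log-odds approximation, which is an assumption made here for 
illustration and not necessarily the method used in the Battle Creek Study. 
The multiplier z = 1.645 gives a two-sided 90% interval, matching the p = 0.10 
level mentioned above. 

```python
import math


def odds_ratio(n_dr, n_ndr, n_dnr, n_ndnr):
    """Odds ratio from a 2x2 table.

    n_dr   : exposed, diseased
    n_ndr  : exposed, not diseased
    n_dnr  : unexposed, diseased
    n_ndnr : unexposed, not diseased
    """
    return (n_dr / n_ndr) / (n_dnr / n_ndnr)


def woolf_interval(n_dr, n_ndr, n_dnr, n_ndnr, z=1.645):
    """Approximate interval for the odds ratio on the log scale (Woolf).
    Assumes all four counts are positive and not too small."""
    log_or = math.log(odds_ratio(n_dr, n_ndr, n_dnr, n_ndnr))
    se = math.sqrt(1 / n_dr + 1 / n_ndr + 1 / n_dnr + 1 / n_ndnr)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Counts from the worked example: 50 of 250 exposed and 10 of 500
    # unexposed individuals exhibit liver disease.
    counts = (50, 200, 10, 490)
    print(odds_ratio(*counts))
    print(woolf_interval(*counts))
```

When the counts in a stratum are small, the standard error term grows and the 
interval widens, which is the sampling variability referred to above. 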

Another frequently utilized statistical measure of effect was the rate 
ratio or relative risk; in words, 

\[ \mathrm{RR}(D; R) = \frac{\text{Number exhibiting } D \text{ per person-month exposed to } R}{\text{Number exhibiting } D \text{ per person-month of not-}R} \]

This index controls broadly for exposure. 
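
A rate ratio of this kind can be computed as follows. The sketch is 
illustrative only: the function name and the case counts and person-months in 
the usage lines are invented for the example and are not data from the Battle 
Creek Study. 

```python
def rate_ratio(cases_exposed, person_months_exposed,
               cases_unexposed, person_months_unexposed):
    """Ratio of disease rates per person-month, exposed versus unexposed."""
    rate_exposed = cases_exposed / person_months_exposed
    rate_unexposed = cases_unexposed / person_months_unexposed
    return rate_exposed / rate_unexposed


if __name__ == "__main__":
    # Hypothetical figures: 12 cases in 6000 exposed person-months,
    # 10 cases in 10000 unexposed person-months.
    print(rate_ratio(12, 6000, 10, 10000))
```

Dividing by person-months rather than by head counts is what allows the index 
to account, broadly, for differing lengths of exposure. 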

Statistical modeling of realistically individually variable exposure or 
dosage responses was conducted by multivariable regression analysis, 
particularly the proportional hazard or Cox model, see Cox (1972), and the 
logistic regression model, Cox and Oakes (1984). These models were also 
utilized in the Woburn Study; both may now be fitted to data using standard 
package computer programs, as may a variety of other relevant generalized 
linear models; see McCullagh and Nelder (1983). The latter techniques were 
available in 1986, and their application to the Battle Creek data could be of 
interest. 
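
For orientation, the two regression models named above have the following 
standard forms. The notation is an assumption of this sketch rather than that 
of the cited studies: $x$ denotes an individual's vector of covariates (for 
example estimated dose, age, and sex), $\beta$ a vector of regression 
coefficients, $p(x)$ the probability of the health response, and $h(t \mid x)$ 
the hazard (instantaneous rate) of the response at time $t$, with $h_0(t)$ an 
unspecified baseline hazard. 

\[ \log \frac{p(x)}{1 - p(x)} = \beta_0 + \beta^{\top} x \qquad \text{(logistic regression)} \]

\[ h(t \mid x) = h_0(t) \exp(\beta^{\top} x) \qquad \text{(proportional hazards, or Cox, model)} \]

In the logistic model $\exp(\beta_j)$ is the odds ratio associated with a unit 
increase in covariate $x_j$, the others being held fixed; in the Cox model it 
is the corresponding hazard ratio. Both thus generalize the simple two-group 
indices described earlier to individually varying exposure. 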

The statistical procedures carried out on the data involved the above 
models when viewed as appropriate, plus others. The simple general conclusion 
was that no positive or significant (below the p = 0.1 level) relationship 
was found between exposure to VOCs and adverse effects on health. In fact, a 
weak reverse effect was noted: the data suggest an excess of abnormal response 
values in the reference or control cohort. This effect has not been explained. 

Discussion 

Despite careful attempts to control the relevance and quality of the data 
used, the health effects associated with contamination levels in both Woburn 
and Battle Creek appeared small. Furthermore, they often did not reach 
conventional levels of statistical significance. 

Such outcomes may well have several explanations. One is that, despite 
considerable care, the data obtainable retrospectively on individual dosages 
and responses were simply inadequate to allow the detection of rather weak and 
individual-specific effects. Small effects are difficult to detect even when 
exposed populations are homogeneous, and the samples studied may not have been 
homogeneous enough in response to the "contaminant treatment" level under 
investigation, to reveal the latter's influence over and above that of the 
prevailing background. Furthermore, the more sophisticated regression models 
used do not take account of the fact that data on dosage and health-effect 
responses are surrogate, in that they represent the variation of causal 
variables indirectly, in broad error-afflicted summary, and they are not 
designed to reflect the variation of individual responses to contaminants and 
background. 

To make further progress in quantitative epidemiologic studies it appears 
necessary to obtain increasingly pertinent and inclusive dose and response 
data, where guidance for what data are needed will derive from improved 
understanding of biological response to chemical intruders. That same 
understanding will permit the construction of models that can better be relied 
upon to predict health effects in a trustworthy manner. 

8. SUMMARY AND CONCLUSIONS 

This chapter furnishes an overview of a selection of the many 
mathematical models used in representing the flow, transformation, and fate of 
toxic substances in the environment, and eventually within organisms such as 
the human body. The presentation is in no way encyclopedic; for example, there 
is no discussion of air pollution generation, transport, and human exposure, a 
significant omission. References are provided to repositories of models of 
transport and exposure. The functionality of dose-response models, 
particularly for cancer, is described in some literary detail, without 
mathematical elaboration; references will allow interested readers access to 
details. Such mechanistic models are not now widely available at a fundamental 
level for many other health effects, a deficiency which appears to offer 
multifold research opportunities. 

The users of models must face the task of linking or integration of 
models of various stages, from pollution generation, through transport, 
dispersion, and transformation, to eventual exposure and dosage of an organism 
to produce various health effects. The literature on methodology for this 
important linkage problem, with its attendant uncertainties, appears to be 
minuscule; see, however, Smith in Chapter 5 of Bloom and Poskitt (1988), who 
examines DNA adduct formation caused by ethylene oxide. This work emphasizes 
mechanistic differential equation tools at the expense of attempts to 
characterize inter-individual, and other, variability by stochastic models. It 
provides a start on the road towards credible interspecies extrapolation. 

Through the consideration of two case studies concerning health effects 
or risks associated with contaminated drinking water, we appreciate the 
difficulty of establishing a causal link between presence of an elevated 
contaminant level and strong evidence of corresponding health effects. The 
credible reconstruction of historical exposure from available data is seen to 
be essential, but difficult and fraught with often unquantified uncertainty. 
Additionally, the variable susceptibility of those exposed can only add to the 
difficulty of linkage. This area appears to be a prime candidate for future 
research. For example, data on the response to toxic chemical agents in 
drinking water of individuals who are suffering from various forms of disease 
might well reveal magnified health effects. To learn this, the appropriate 
data must be available, and be analyzed. Eventually, after the physical 
transport and biological mechanisms are well enough understood, the impact of 
multiple results may be usefully anticipated and appropriate steps taken. More 
research appears necessary before this is reliably possible. 

In various branches of science and technology it has been found useful to 
consider together, and possibly to judiciously combine, data on similar 
situations so as to achieve stronger quantitative estimates of effects; see 
Gaver et al. (1992). In medical and social science circles in the U.S. this 
activity is frequently called meta analysis; the term overviews is preferred 
by the prominent British biostatistician R. Peto. It appears likely that the 
techniques of this field, notably but not exclusively hierarchical Bayesian 
analysis, will find a place in the toolkits of those who analyze environmental 
transport and fate, exposure, and dose-response relations and data. The basic 
notion is that at least certain classes of situations, such as certain 
Superfund sites eligible for cleanup, may have features in common that can be 
invoked to strengthen individual assessments of cleanup status, for 
example. This would involve suitably combining (not uncritically pooling) 
data from several similar sites with those of a specific site of interest so as 
to improve the estimate of its condition after a particular effort has been 
made. Likewise, assessments of health effects of certain pollutants that have 
been conducted at different times or places might be profitably brought 
together in an overall analysis. Such actions have precedent, but do not now 
appear to be commonly employed. For a status report on meta analysis in 
various fields, see Gaver et al. (1992). 
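
The difference between combining and uncritically pooling can be made 
concrete with a small sketch. It uses a normal-normal hierarchical model in 
which the between-site standard deviation, tau, is treated as known; that 
simplification, the function name, and the numbers in the usage lines are all 
assumptions made for illustration. A full hierarchical Bayesian analysis would 
also place a prior on tau and on the common mean. 

```python
def shrink_site_estimates(estimates, std_errors, tau):
    """Combine per-site estimates under a normal-normal hierarchical model.

    estimates  : per-site effect estimates y_i
    std_errors : their standard errors s_i (all positive)
    tau        : assumed between-site standard deviation (positive)

    Returns the estimated common mean and the shrunken per-site estimates.
    """
    # Weights reflect both within-site and between-site variability.
    weights = [1.0 / (s ** 2 + tau ** 2) for s in std_errors]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    shrunk = []
    for y, s in zip(estimates, std_errors):
        # Fraction of the way each site estimate is pulled toward mu;
        # noisier sites (large s) are pulled further.
        b = s ** 2 / (s ** 2 + tau ** 2)
        shrunk.append(b * mu + (1.0 - b) * y)
    return mu, shrunk


if __name__ == "__main__":
    # Hypothetical site-level estimates and standard errors.
    mu, shrunk = shrink_site_estimates([0.2, 0.5, 1.4], [0.3, 0.3, 0.6], 0.4)
    print(mu, shrunk)
```

Each site keeps its own estimate, but one that borrows strength from the 
others; as tau grows the sites are left nearly untouched, and as tau shrinks 
toward zero the result approaches complete pooling. 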

As was stated initially, those who make quantitative studies of 
environmental risk are essentially never able to conduct planned or designed 
experiments on human subjects. Consequently they are confronted with 
observational data that potentially contain biasing and confounding factors. A 
good account of effective statistical analyses of such difficult data is given 
by Cochran (1983) (edited and completed by L. Moses and F. Mosteller). This 
material should not be ignored by those health-effects analysts attempting to 
draw conclusions from environmental and observed response data. In particular, 
the careful use of modern nonlinear regression techniques in company with 
appropriately biologically motivated mechanistic models is to be encouraged in 
order to adjust for measured covariates such as age, gender, weight, or others 
while also properly modeling the biological phenomena. Recent work by 
Piegorsch and Casella (1992) is in this direction; they consider mouse 
genotoxicity data in the context of a hierarchical logistic or binomial model. 
See also the references in Sections 4, 5, and 6 above. 

The aim of this chapter has been to describe some of the ways in which 
quantitative modeling and analytical techniques have been applied in the 
environmental areas of concern in this book. It is hoped that the result will 
be stimulating and helpful, especially to those new to this approach. 